Abstract

In this paper, we investigate the polynomial integrand of an integral
formula that yields the expected length of the minimal spanning tree of
a graph whose edge lengths are independent and uniformly distributed over
the interval [0,1]. In particular, we derive a general formula for the
coefficients of the polynomial and apply it to express the first few
coefficients in terms of the structure of the underlying graph, e.g. the
number of vertices, edges, and cycles.


1 Introduction 

In 2002, J.M. Steele [7] derived an integral formula for the expected length 
of a minimal spanning tree (MST) of a graph with independent edge lengths 
uniformly distributed over the interval [0,1]. While the formula gives an exact 
value of the mean length of the MST in terms of the Tutte polynomial of the 
graph, it yields (at least to us) little intuition of how the MST relates to the 
structure of the underlying graph. 

This provided the motivation for the research project investigated by the 
Willamette University group of the Willamette Valley REU-RET Consortium 
for Mathematics Research in the summer of 2008. The authors of this paper 
were members of that research group and this paper covers the work that began 
that summer. 

The main result of this paper is a general formula for the coefficients of the 
polynomial integrand in Steele’s formula for the expected length of the MST of a 
simple, finite, connected graph. For the first seven coefficients of the polynomial, 
we prove a surprising result that expresses the coefficients in terms of features 
of the underlying graph; e.g. the number of vertices, edges, and cycles. 

The remainder of this paper is organized as follows: In Section 2 we state
Steele’s formula, which is written in terms of the Tutte polynomial of the
underlying graph. In Section 3 we investigate the integrand of the formula and
prove that it is a polynomial, expressing the coefficients in terms of
characteristics of the graph. We illustrate our results with an example in
Section 4 and examine the particular case of the complete graph in Section 5.

Throughout this paper, “graph” means a finite simple graph. We adopt the
usual notations: $V(G)$ and $E(G)$ are the vertex and edge sets of $G$, respectively.
The rank of $G$ is denoted by $r(G)$, and $r(G) = |V(G)| - k(G)$, where $k(G)$
denotes the number of connected components of $G$.


2 Steele’s Formula 


Let $G$ be a graph. We assign independent random lengths $\xi_e$ with uniform
distribution over the interval [0,1] to the edges $e \in E(G)$. The total length of a
minimal spanning tree (MST) of the graph $G$ is denoted by

\[ L(G) = \sum_{e \in E(\mathrm{MST}(G))} \xi_e. \]

We are interested in the expected value of $L(G)$, which we denote by $E[L(G)]$.

Steele’s formula for E[L(G)] is written in terms of the Tutte polynomial of 
a graph, which we define next. 

Definition 2.1. Let $G$ be a graph, and define $S(G)$ to be the set of spanning
subgraphs of $G$; i.e., subgraphs of $G$ with vertex set $V(G)$ and edge set a subset
of $E(G)$. The Tutte polynomial of $G$ is defined as follows:

\[ T(G; x, y) = \sum_{A \in S(G)} (x-1)^{r(G)-r(A)} (y-1)^{|E(A)|-r(A)}. \]

The Tutte polynomial of a graph encodes much information about the graph, 
but we will only use the definition above in our analysis and refer the reader to 
[T] for more information. 
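As an illustration of the subgraph expansion in Definition 2.1, the following minimal Python sketch evaluates $T(G; x, y)$ by brute force over all spanning subgraphs. The function names `count_components` and `tutte`, and the encoding of a graph as a vertex count plus a list of vertex pairs, are assumptions made for this example only; the enumeration is exponential in the number of edges and is meant for small graphs.

```python
from itertools import combinations

def count_components(n, edges):
    """Number of connected components of the spanning subgraph on
    vertices 0..n-1 with the given edges (simple union-find)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count

def tutte(n, edges, x, y):
    """Evaluate T(G; x, y) by summing over all spanning subgraphs A,
    using rank r(A) = n - k(A)."""
    r_G = n - count_components(n, edges)
    total = 0
    for size in range(len(edges) + 1):
        for A in combinations(edges, size):
            r_A = n - count_components(n, A)
            total += (x - 1) ** (r_G - r_A) * (y - 1) ** (size - r_A)
    return total

triangle = [(0, 1), (1, 2), (0, 2)]
print(tutte(3, triangle, 2, 2))  # 2^m for a graph with m edges
print(tutte(3, triangle, 1, 1))  # number of spanning trees
```

For the triangle the two printed values are $2^3$ and the number of spanning trees, which are standard evaluations of the Tutte polynomial at $(2,2)$ and $(1,1)$.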

We will use the following result about the Tutte polynomial in the proof of 
the main result. The proof is a straightforward calculation from the definition 
and so we omit it. 


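Lemma 2.2. Let $G$ be a connected graph with $n$ vertices and $m$ edges. Then
for values of $(x, y)$ satisfying $(x-1)(y-1) = 1$, we have

(a) \[ T(G; x, y) = (x-1)^{n-1} \left( \frac{x}{x-1} \right)^{m}, \]

(b) \[ T_x(G; x, y) = (x-1)^{n-2} \left[ \sum_{A \in S(G)} k(A) \left( \frac{1}{x-1} \right)^{|E(A)|} - \left( \frac{x}{x-1} \right)^{m} \right], \]

where $T_x$ denotes the partial derivative of $T$ with respect to $x$.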


We now state Steele’s integral formula for the expected length of the minimal 
spanning tree that was proved in [7]. 
Theorem 2.3. (Steele’s formula) Let $G$ be a connected graph and $T(G; x, y)$
the Tutte polynomial of $G$. Then

\[ E[L(G)] = \int_0^1 \frac{1-t}{t} \, \frac{T_x\left(G; \frac{1}{t}, \frac{1}{1-t}\right)}{T\left(G; \frac{1}{t}, \frac{1}{1-t}\right)} \, dt. \tag{1} \]

Steele’s formula above has been generalized to the case of an arbitrary, but
still identical, edge distribution [5] and to edge distributions that are not
necessarily identically distributed [5].
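The quantity $E[L(G)]$ can also be estimated by simulation, which gives an independent sanity check on any exact value obtained from Steele’s formula. The sketch below is an assumption-laden illustration, not part of the paper’s method: it draws independent uniform edge lengths, runs Kruskal’s algorithm, and averages. The names `mst_length` and `estimate_expected_mst`, the trial count, and the seed are all hypothetical choices. For the triangle the exact value is 3/4, because the MST consists of the two shortest of three independent uniform lengths, whose expectations are 1/4 and 1/2.

```python
import random

def mst_length(n, weighted_edges):
    """Total length of a minimal spanning tree of a connected graph,
    computed with Kruskal's algorithm. weighted_edges holds (w, u, v)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    total = 0.0
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:          # edge joins two components, so keep it
            parent[ru] = rv
            total += w
    return total

def estimate_expected_mst(n, edges, trials=20000, seed=0):
    """Monte Carlo estimate of E[L(G)] with i.i.d. Uniform[0,1] lengths."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        acc += mst_length(n, [(rng.random(), u, v) for u, v in edges])
    return acc / trials

triangle = [(0, 1), (1, 2), (0, 2)]
print(estimate_expected_mst(3, triangle))  # close to 0.75
```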


3 Integrand in Steele’s Formula 


3.1 Polynomial integrand 

We begin by showing that the integrand in Steele’s integral formula is a
polynomial of degree less than or equal to the number of edges in the graph.

Theorem 3.1. Let $G$ be a connected graph with $n$ vertices and $m$ edges. Then

\[ E[L(G)] = \int_0^1 p_m(t) \, dt, \]

where $p_m(t)$ is a polynomial of degree less than or equal to $m$.


Proof. For convenience, we let $|A| = |E(A)|$. By Lemma 2.2 we have

\[ \frac{1-t}{t} \, \frac{T_x\left(G; \frac{1}{t}, \frac{1}{1-t}\right)}{T\left(G; \frac{1}{t}, \frac{1}{1-t}\right)} = -1 + \sum_{A \in S(G)} k(A) \, t^{|A|} (1-t)^{m-|A|} = -1 + \sum_{A \in S(G)} k(A) \, t^{|A|} \sum_{i=0}^{m-|A|} \binom{m-|A|}{i} (-1)^{i} t^{i}. \]

This establishes the result, but we refine the coefficients further. Define

\[ p_m(t) = -1 + \sum_{A \in S(G)} k(A) \sum_{j=0}^{m-|A|} (-1)^{m-|A|-j} \binom{m-|A|}{j} t^{m-j}. \]

Let $i = m - j$. Then $m - |A| - j = i - |A|$, so we have

\[ p_m(t) = -1 + \sum_{A \in S(G)} k(A) \sum_{m-i=0}^{m-|A|} (-1)^{i-|A|} \binom{m-|A|}{m-i} t^{i}. \]

To find the coefficient of $t^i$, we sum over all $A$ in $S(G)$ such that $|E(A)| \le i$.
This yields

\[ a_i = \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{A \in S_\ell} k(A), \tag{2} \]

where $S_\ell := \{A \in S(G) : |E(A)| = \ell\}$. Thus

\[ p_m(t) = -1 + \sum_{i=0}^{m} a_i t^i \]

with $a_i$ as above.


□ 
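Formula (2) can be evaluated directly for small graphs. The following Python sketch, under the assumption that a graph is given as a vertex count and an edge list, computes the coefficients $a_0, \dots, a_m$ by enumerating the sets $S_\ell$ and then integrates $p_m(t)$ exactly with rational arithmetic. The function names are hypothetical and the enumeration is exponential in $m$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def count_components(n, edges):
    """Connected components of the spanning subgraph with these edges."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count

def integrand_coefficients(n, edges):
    """Return [a_0, ..., a_m] as given by formula (2)."""
    m = len(edges)
    # comp_sums[l] = sum of k(A) over spanning subgraphs A with l edges
    comp_sums = [sum(count_components(n, A) for A in combinations(edges, l))
                 for l in range(m + 1)]
    return [sum((-1) ** (i - l) * comb(m - l, m - i) * comp_sums[l]
                for l in range(i + 1))
            for i in range(m + 1)]

def expected_mst_length(n, edges):
    """Exact integral over [0,1] of p_m(t) = -1 + sum_i a_i t^i."""
    a = integrand_coefficients(n, edges)
    return -1 + sum(Fraction(a_i, i + 1) for i, a_i in enumerate(a))

triangle = [(0, 1), (1, 2), (0, 2)]
print(integrand_coefficients(3, triangle))  # [3, -3, 0, 1]
print(expected_mst_length(3, triangle))     # 3/4
```

For the triangle this gives $p_3(t) = 2 - 3t + t^3$, whose integral over $[0,1]$ is $3/4$.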


In the proof of Theorem 3.1 we derived an initial formula (2) for the
coefficients of the polynomial integrand in Steele’s formula for the expected length
of the MST. In the next section, we derive an easier working form for the
coefficients, but we end this section with our first main result on the first three
coefficients.


Theorem 3.2. Let

\[ p_m(t) = -1 + \sum_{i=0}^{m} a_i t^i \]

be the polynomial integrand in Steele’s formula for the expected length of the
MST of a connected graph $G$ with $n$ vertices and $m$ edges. Then

\[ a_0 = n, \quad a_1 = -m, \quad \text{and} \quad a_2 = 0. \]


Proof. The set $S_0$ consists of just the single subgraph of $G$ with no edges and
$n$ vertices, which has $n$ connected components. Therefore, $\sum_{A \in S_0} k(A) = n$.

Next, the set $S_1$ consists of the $m$ spanning subgraphs with just one edge, each
of which has exactly $n - 1$ connected components. Therefore, $\sum_{A \in S_1} k(A) = m(n-1)$.
Lastly, the set $S_2$ consists of $\binom{m}{2}$ spanning subgraphs with two edges,
each of which has exactly $n - 2$ connected components.

Substituting these values into formula (2) yields

\[ a_0 = \sum_{A \in S_0} k(A) = n, \]

\[ a_1 = -m \sum_{A \in S_0} k(A) + \sum_{A \in S_1} k(A) = -mn + m(n-1) = -m, \]

and

\[ \begin{aligned} a_2 &= \binom{m}{2} \sum_{A \in S_0} k(A) - (m-1) \sum_{A \in S_1} k(A) + \sum_{A \in S_2} k(A) \\ &= \binom{m}{2} n - m(m-1)(n-1) + \binom{m}{2}(n-2) = 0. \end{aligned} \]


This completes the proof. 


□ 


3.2 Coefficients of the polynomial integrand 

In the previous theorem, the initial formula (2) for the coefficients is easily
applied for the cases $\ell = 0, 1, 2$, because for each such $\ell$, the members of $S_\ell$ all
have the same number of connected components. When $k(A)$ is non-constant
on $S_\ell$, the enumeration becomes more difficult.





Accordingly, we partition the set $S_\ell$ into subsets with different numbers of
connected components. This can be achieved by partitioning over the ranks of
the members of $S_\ell$, since subgraphs in $S_\ell$ with the same rank $r$ have the same
number of connected components, namely $n - r$.

Let $k_r^\ell$ be the number of spanning subgraphs of $G$ in $S_\ell$ with rank $r$; i.e.
the number of spanning subgraphs of $G$ with $\ell$ edges and $n - r$ connected
components. In terms of $k_r^\ell$, formula (2) can be rewritten as

\[ a_i = \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{r=r_\ell}^{\ell} k_r^\ell \, (n - r), \tag{3} \]

where $r_\ell$ is the minimum rank of a graph with $n$ vertices and $\ell$ edges. If $K_q$ is
the largest complete graph with $|E(K_q)| < \ell$, then $r_\ell = q$. In other words, $r_\ell$ is
the largest integer with $\binom{r_\ell}{2} < \ell$.
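The rank partition used in formula (3) can be tabulated by brute force. The sketch below counts, for a fixed number of edges $\ell$, how many spanning subgraphs have each rank, and computes the minimum rank $r_\ell$ from the binomial condition stated above. The names `rank_counts` and `min_rank` are hypothetical, and `min_rank` assumes the graph has enough vertices for the extremal configuration to exist.

```python
from itertools import combinations
from math import comb

def count_components(n, edges):
    """Connected components of the spanning subgraph with these edges."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count

def rank_counts(n, edges, l):
    """Map each rank r to the number of spanning subgraphs with l edges
    and rank r (that is, with n - r connected components)."""
    counts = {}
    for A in combinations(edges, l):
        r = n - count_components(n, A)
        counts[r] = counts.get(r, 0) + 1
    return counts

def min_rank(l):
    """Largest integer r with C(r, 2) < l, taken to be 0 when l = 0."""
    r = 0
    while comb(r + 1, 2) < l:
        r += 1
    return r

k4_edges = list(combinations(range(4), 2))
print(rank_counts(4, k4_edges, 3))  # {2: 4, 3: 16}: 4 triangles, 16 forests
print([min_rank(l) for l in range(7)])
```

In $K_4$ the 20 three-edge subgraphs split into the 4 triangles (rank 2) and 16 forests (rank 3), and the counts add up to $\binom{6}{3}$.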

We use the fact that $\sum_{r=r_\ell}^{\ell} k_r^\ell = \binom{m}{\ell}$ to reduce the number of terms of $k_r^\ell$
by one in (3). The new general expression for the polynomial coefficients $a_i$
for $i \ge 3$ is stated in Theorem 3.4 below. The proof of the theorem requires a
couple of combinatorial identities stated in the following lemma.

Lemma 3.3. For integers m,k,i and n, 



Theorem 3.4. Let $a_i$ be the coefficients of the polynomial integrand in Steele’s
integral formula for the expected length of the MST of a connected graph $G$ with
$n$ vertices and $m$ edges. Then for $i \ge 3$

\[ a_i = \sum_{\ell=3}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{r=r_\ell}^{\ell-1} k_r^\ell \, (\ell - r). \]


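Proof. Summing all the terms $k_r^\ell$ for a fixed number of edges $\ell$ yields the total
number of spanning subgraphs in $S_\ell$, which equals $\binom{m}{\ell}$. This implies that
$k_\ell^\ell = \binom{m}{\ell} - \sum_{r=r_\ell}^{\ell-1} k_r^\ell$; thus from formula (3), we get

\[ \begin{aligned} a_i &= \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \left[ \sum_{r=r_\ell}^{\ell-1} k_r^\ell \, (n-r) + k_\ell^\ell \, (n-\ell) \right] \\ &= \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \left[ \sum_{r=r_\ell}^{\ell-1} k_r^\ell \, (\ell-r) + \binom{m}{\ell} (n-\ell) \right] \\ &= \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{r=r_\ell}^{\ell-1} k_r^\ell \, (\ell-r) + \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \binom{m}{\ell} (n-\ell). \end{aligned} \]

The minimum ranks for $\ell = 0, 1, 2$ are $r_0 = 0$, $r_1 = 1$ and $r_2 = 2$. Therefore,
for these values of $\ell$, the summation on $r$ is empty and only the term in the second
summation remains. This and Lemma 3.3 a) yield

\[ a_i = \sum_{\ell=3}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{r=r_\ell}^{\ell-1} k_r^\ell \, (\ell-r) + \binom{m}{i} \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{i}{\ell} (n-\ell). \]

The second sum equals zero by the Binomial Theorem and Lemma 3.3 b). □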
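The vanishing of the second sum for $i \ge 2$ is easy to confirm numerically. The short Python check below evaluates $\sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \binom{m}{\ell} (n-\ell)$ over a range of small parameters; the function name `second_sum` and the parameter ranges are choices made for this illustration. For $i = 0$ and $i = 1$ the same expression returns $n$ and $-m$, in agreement with Theorem 3.2.

```python
from math import comb

def second_sum(n, m, i):
    """Sum over l = 0..i of (-1)^(i-l) C(m-l, m-i) C(m, l) (n - l)."""
    return sum((-1) ** (i - l) * comb(m - l, m - i) * comb(m, l) * (n - l)
               for l in range(i + 1))

# The sum vanishes for every i >= 2 in the tested range of (n, m).
vanishes = all(second_sum(n, m, i) == 0
               for n in range(2, 8)
               for m in range(n - 1, comb(n, 2) + 1)
               for i in range(2, m + 1))
print(vanishes)
print(second_sum(5, 6, 0), second_sum(5, 6, 1))  # n and -m
```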

The above result gives a general formula for the coefficients of the polynomial
integrand in terms of the values $k_r^\ell$. Determining the values of $k_r^\ell$ for large $\ell$
poses a major challenge. We conclude this section with the enumeration for
$\ell = 3, 4, 5, 6$ and the corresponding coefficients of $p_m(t)$.


Definition 3.5. For a connected graph $G$, define

(a) $c_i$ = number of cycles of size $i$ in $G$.

(b) $c_{i,1}$ = number of cycles of size $i$ with one chord.

(c) $\bar{c}_{i,1}$ = number of cycles of size $i$ with one chord and one additional edge
that is not a chord of the cycle.

(d) $k_i$ = number of complete subgraphs $K_i$ in $G$.

(e) $k_{i,j}$ = number of complete bipartite subgraphs $K_{i,j}$ in $G$.

Lemma 3.6. For $\ell = 3, 4, 5, 6$,

\[ \sum_{r=r_\ell}^{\ell-1} k_r^\ell \, (\ell - r) = \sum_{j=3}^{\ell} c_j \binom{m-j}{m-\ell} - d_\ell, \tag{4} \]

where

\[ d_3 = 0, \quad d_4 = 0, \quad d_5 = k_3^5, \quad d_6 = \bar{c}_{4,1} + c_{5,1} + k_{3,2} + 4k_4. \]

Proof. We show the above result for $\ell = 5$; the other cases are similar in nature.
The minimum rank for $\ell = 5$ is $r_5 = 3$ and so the left-hand side of equation
(4) is $k_4^5 + 2k_3^5$. The types of subgraphs counted in $k_4^5$ are those with 5 edges
and $n - 4$ connected components, which have the form shown in Figure 1(a)-(c).
Analogously, there is only one type of subgraph counted in $k_3^5$, which is shown
in Figure 1(d). Note that the graphs in Figures 1 and 2 that are a one-clique
sum of smaller graphs actually represent families that include subgraphs that
are disjoint unions of the summands. For example, 1(a) includes $K_3 \sqcup P_2$, where
$K_3$ is the complete graph on three vertices and $P_2$ is a path with two edges.

Now consider the right-hand side of (4). Start with any 3-cycle and choose
any other 2 edges in the graph; there are $c_3 \binom{m-3}{2}$ ways to do this. This counts
all the types of subgraphs depicted in Figure 1(a) and counts the subgraphs
in Figure 1(d) twice. Figure 2 gives a pictorial representation of $c_3 \binom{m-3}{2}$. The
subgraphs counted by $c_4 \binom{m-4}{1}$ (start with a 4-cycle and choose any other edge)
are of the type shown in Figure 1(b) and Figure 1(d). These are depicted in the
right-hand side of Figure 2.

[Figure 1, panels (a)-(d): Representations of the subgraphs counted in $k_4^5$ and $k_3^5$.]

[Figure 2: Representations of the subgraphs counted in $c_3 \binom{m-3}{2}$ and $c_4 \binom{m-4}{1}$.]

Lastly, $c_5$ is the number of 5-cycles, which are shown in Figure 1(c). Therefore,

\[ k_4^5 + 2k_3^5 = c_3 \binom{m-3}{2} + c_4 \binom{m-4}{1} + c_5 - k_3^5. \]

□


While initially Lemma 3.6 appears only to complicate the coefficient formula
given in Theorem 3.4, the next lemma shows that when it is applied to
the coefficient formula, it actually simplifies it. The proof is a straightforward
calculation and so we omit it; the reasoning is analogous to the proof of Lemma
3.6. Although we proved the first equation in Lemma 3.7 for $i = 3, 4, 5, 6$, we
conjecture that it holds in general for all $i \ge 3$.

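Lemma 3.7. For $i = 3, 4, 5, 6$,

\[ \sum_{\ell=3}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{j=3}^{\ell} c_j \binom{m-j}{m-\ell} = c_i \]

and thus

\[ a_i = c_i - \sum_{\ell=3}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} d_\ell. \tag{5} \]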

Finally, we derive representations for the coefficients $a_3$ through $a_6$ in terms
of the structure of the underlying graph $G$. The proof of the theorem is a direct
application of Lemmas 3.6 and 3.7 to the general coefficient formula given in
Theorem 3.4.


Theorem 3.8. Let

\[ p_m(t) = -1 + \sum_{i=0}^{m} a_i t^i \]

be the polynomial integrand in Steele’s formula for the expected length of the
MST of a connected graph $G$ with $n$ vertices and $m$ edges. Then

\[ a_3 = c_3, \quad a_4 = c_4, \quad a_5 = c_5 - k_3^5, \quad a_6 = c_6 + 2k_4 - c_{5,1} - k_{3,2}. \]


4 Application of Results 


In this section, we apply Theorems 3.2 and 3.8 to the complete bipartite graph
$K_{3,2}$ in order to derive the expected length of the minimal spanning tree of this
graph.

Proposition 4.1. Let $p_m(t)$ be the polynomial integrand in Steele’s formula for
the complete bipartite graph $K_{3,2}$ shown below.

[Figure: the complete bipartite graph $K_{3,2}$.]

Then

\[ p_m(t) = 4 - 6t + 3t^4 - t^6 \]

and

\[ E[L(K_{3,2})] = \int_0^1 \left( 4 - 6t + 3t^4 - t^6 \right) dt = 4 - 3 + \frac{3}{5} - \frac{1}{7} = \frac{51}{35}. \]

Proof. For $K_{3,2}$, $n = 5$ and $m = 6$. By Theorem 3.2 we have $a_0 = 5$, $a_1 = -6$
and $a_2 = 0$. Next, we apply Theorem 3.8. $K_{3,2}$ has no 3-cycles, so $a_3 = 0$. The
graph has three 4-cycles, so $a_4 = 3$. For the coefficient $a_5$, we note that there
are no 5-cycles and also no $k_3^5$-type subgraphs (a 4-cycle with a chord) either,
so $a_5 = 0$. Lastly, for $a_6$, there are no 6-cycles, no $K_4$ subgraphs, no $c_{5,1}$-type
subgraphs (since there are no 5-cycles), and one $k_{3,2}$-type subgraph (the entire
graph). Therefore, $a_6 = -1$ and we get

\[ p_m(t) = -1 + 5 - 6t + 3t^4 - t^6. \]


□ 
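The polynomial in Proposition 4.1 can be cross-checked against the unexpanded form of the integrand from the proof of Theorem 3.1, namely $-1 + \sum_{A} k(A)\, t^{|A|} (1-t)^{m-|A|}$. The Python sketch below evaluates both expressions at eight rational points, which is enough to identify two polynomials of degree at most 6, and integrates the stated polynomial exactly. The vertex labelling of $K_{3,2}$ and all function names are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations

def count_components(n, edges):
    """Connected components of the spanning subgraph with these edges."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count

# K_{3,2}: vertices 0, 1, 2 on one side and 3, 4 on the other.
K32_EDGES = [(a, b) for a in (0, 1, 2) for b in (3, 4)]

def integrand_direct(n, edges, t):
    """-1 + sum over spanning subgraphs A of k(A) t^|A| (1-t)^(m-|A|)."""
    m = len(edges)
    total = Fraction(-1)
    for size in range(m + 1):
        for A in combinations(edges, size):
            total += count_components(n, A) * t ** size * (1 - t) ** (m - size)
    return total

def stated_polynomial(t):
    """The polynomial 4 - 6t + 3t^4 - t^6 from Proposition 4.1."""
    return 4 - 6 * t + 3 * t ** 4 - t ** 6

points = [Fraction(j, 7) for j in range(8)]
agree = all(integrand_direct(5, K32_EDGES, t) == stated_polynomial(t)
            for t in points)

# Term-by-term integration over [0, 1]: exponent k contributes c / (k + 1).
integral = sum(Fraction(c, k + 1) for k, c in {0: 4, 1: -6, 4: 3, 6: -1}.items())
print(agree, integral)
```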


5 The Complete Graph 

The MST problem on $K_n$ has been studied extensively. Frieze [3] proved that

\[ \lim_{n \to \infty} E[L(K_n)] = \zeta(3) = \sum_{i=1}^{\infty} i^{-3} = 1.202\ldots \]

In [5], Steele extended this result to general edge distributions and Janson [3]
proved a central limit theorem for $L(K_n)$.

We apply our results to the complete graph and derive exact formulas in 
terms of the number of vertices n for the first seven coefficients of the polynomial 
integrand in Steele’s formula. 




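Theorem 5.1. Let $p_m(t) = -1 + \sum_{i=0}^{m} a_i t^i$ be the polynomial integrand in
Steele’s formula for the complete graph on $n$ vertices, denoted by $K_n$. Then

\[ a_0 = n, \quad a_1 = -\binom{n}{2}, \quad a_2 = 0, \quad a_3 = \binom{n}{3}, \quad a_4 = 3\binom{n}{4}, \]

\[ a_5 = 12\binom{n}{5} - 6\binom{n}{4}, \quad a_6 = 60\binom{n}{6} - 70\binom{n}{5} + 2\binom{n}{4}. \]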



Proof. For the complete graph on $n$ vertices, the number of edges is $m = \binom{n}{2}$ and
the number of cycles of length $j$ is given by

\[ c_j = \binom{n}{j} \frac{(j-1)!}{2}. \]

In addition, $k_3^5 = 2c_4$, $c_{5,1} = 5c_5$, $k_4 = \binom{n}{4}$ and $k_{3,2} = \binom{n}{5}\binom{5}{2}$. □
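The counts in this proof can be combined with Theorems 3.2 and 3.8 to tabulate the first seven coefficients for any $n$. The Python sketch below does this under the assumption that the subtracted term in $a_5$ is the number of 4-cycles with one chord, as in the proof of Proposition 4.1; the function names are hypothetical. For $K_4$, where $m = 6$, the seven coefficients determine the whole polynomial, so the integral can be computed exactly.

```python
from fractions import Fraction
from math import comb, factorial

def cycles(n, j):
    """Number of j-cycles in K_n: C(n, j) (j - 1)! / 2, for j >= 3."""
    return comb(n, j) * factorial(j - 1) // 2

def first_seven_coefficients(n):
    """[a_0, ..., a_6] for K_n from Theorems 3.2 and 3.8."""
    m = comb(n, 2)
    c3, c4, c5, c6 = (cycles(n, j) for j in (3, 4, 5, 6))
    four_cycle_with_chord = 2 * c4   # each 4-cycle in K_n has 2 chords
    c51 = 5 * c5                     # each 5-cycle in K_n has 5 chords
    k4 = comb(n, 4)
    k32 = comb(n, 5) * comb(5, 2)
    return [n, -m, 0, c3, c4,
            c5 - four_cycle_with_chord,
            c6 + 2 * k4 - c51 - k32]

a = first_seven_coefficients(4)
# K_4 has m = 6 edges, so a_0..a_6 give all of p_m(t); integrate over [0, 1].
expected_k4 = -1 + sum(Fraction(a_i, i + 1) for i, a_i in enumerate(a))
print(a, expected_k4)
```

For $n = 4$ the coefficients are $4, -6, 0, 4, 3, -6, 2$, so $p_6(t) = 3 - 6t + 4t^3 + 3t^4 - 6t^5 + 2t^6$ and the integral is $31/35$.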

Numerical calculation of $E[L(K_n)]$ has led to the famous conjecture that the
convergent sequence is also monotone increasing and concave. This problem was
raised at the conference Mathematics and Computer Science II at Versailles in
2002 but no proof has been found. Clearly, our results alone will not answer this
question as we have only derived exact formulas for the first seven coefficients.
But our results give a hint that there may be a pattern to the coefficients of the
polynomial integrand in Steele’s formula for the complete graph, which if true,
would answer the conjecture.

We end this section with a result that factors the polynomial integrand in
Steele’s formula for $K_n$, with one of the factors a polynomial of degree less than
or equal to the number of edges of $K_{n-1}$.

Theorem 5.2. Let $p_m(t)$ be the polynomial integrand in Steele’s formula for
the expected length of the MST of the complete graph on $n$ vertices, denoted by
$K_n$. Then

\[ p_m(t) = (1-t)^{n-1} q(t), \]

where $q(t)$ is a polynomial of degree less than or equal to $\binom{n-1}{2}$.

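Proof. As in the proof of Theorem 3.1, we have

\[ p_m(t) = -1 + \sum_{A \in S(G)} k(A) \, t^{|A|} (1-t)^{m-|A|}, \]

where $m = \binom{n}{2}$.

We factor out $(1-t)^{n-1}$ to get

\[ p_m(t) = (1-t)^{n-1} \left[ \sum_{A \in S(G)} k(A) \, t^{|A|} (1-t)^{m-(n-1)-|A|} - (1-t)^{1-n} \right]. \]

Note that $\binom{n}{2} - (n-1) = \binom{n-1}{2}$. The sum ranges over spanning subgraphs
of size (in edges) from 0 to $\binom{n}{2}$. We split it into two sums as follows:

\[ p_m(t) = (1-t)^{n-1} \left[ \sum_{|A| \le \binom{n-1}{2}} k(A) \, t^{|A|} (1-t)^{\binom{n-1}{2}-|A|} + \sum_{|A| > \binom{n-1}{2}} k(A) \, t^{|A|} (1-t)^{\binom{n-1}{2}-|A|} - (1-t)^{1-n} \right]. \]

Clearly, the first sum over $|A| \le \binom{n-1}{2}$ is a polynomial of degree at most $\binom{n-1}{2}$.
Call it $q_1(t)$.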
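For the second sum, we sum over possible number of edges $i > \binom{n-1}{2}$ and
count the number of subgraphs with $i$ edges, which for the complete graph is
$\binom{m}{i}$ with $m = \binom{n}{2}$. Furthermore, for all spanning subgraphs of $K_n$ with
$i > \binom{n-1}{2}$ edges, the number of connected components is 1. Therefore, we have

\[ \sum_{|A| > \binom{n-1}{2}} k(A) \, t^{|A|} (1-t)^{\binom{n-1}{2}-|A|} - (1-t)^{1-n} = (1-t)^{1-n} \left[ \sum_{i=0}^{m} \binom{m}{i} t^{i} (1-t)^{m-i} - 1 \right] - \sum_{i=0}^{\binom{n-1}{2}} \binom{m}{i} t^{i} (1-t)^{\binom{n-1}{2}-i}. \]

By the Binomial Theorem, the first sum equals 1, so the bracketed term vanishes,
and the second sum taken with its minus sign, call it $q_2(t)$, is a polynomial of
degree at most $\binom{n-1}{2}$.
We now have

\[ p_m(t) = (1-t)^{n-1} q_1(t) + (1-t)^{n-1} q_2(t) = (1-t)^{n-1} \left( q_1(t) + q_2(t) \right), \]

where both $q_1(t)$ and $q_2(t)$ are polynomials of degree less than or equal to $\binom{n-1}{2}$.
This completes the proof. □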
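The factorization can be checked for small $n$ by dividing $p_m(t)$ by $(1-t)$ exactly $n-1$ times and confirming that every remainder is zero. The Python sketch below builds $p_m(t)$ for $K_n$ by brute force from formula (2) and performs the divisions on integer coefficient lists (lowest degree first). All function names are hypothetical, and the enumeration is only practical for very small $n$.

```python
from itertools import combinations
from math import comb

def count_components(n, edges):
    """Connected components of the spanning subgraph with these edges."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count

def complete_graph_integrand(n):
    """Coefficients of p_m(t) for K_n, lowest degree first, via formula (2)."""
    edges = list(combinations(range(n), 2))
    m = len(edges)
    comp_sums = [sum(count_components(n, A) for A in combinations(edges, l))
                 for l in range(m + 1)]
    p = [sum((-1) ** (i - l) * comb(m - l, m - i) * comp_sums[l]
             for l in range(i + 1))
         for i in range(m + 1)]
    p[0] -= 1  # the constant term of p_m(t) is a_0 - 1
    return p

def divide_by_one_minus_t(p):
    """Write p(t) = (1 - t) q(t) + rem and return (q, rem)."""
    d = len(p) - 1
    q = [0] * d
    q[d - 1] = -p[d]
    for k in range(d - 1, 0, -1):
        q[k - 1] = q[k] - p[k]
    return q, p[0] - q[0]

def cofactor(n):
    """q(t) with p_m(t) = (1 - t)^(n-1) q(t); raises if a division fails."""
    q = complete_graph_integrand(n)
    for _ in range(n - 1):
        q, rem = divide_by_one_minus_t(q)
        if rem != 0:
            raise ValueError("p_m(t) is not divisible by (1 - t)^(n-1)")
    return q

print(cofactor(3))  # [2, 1], i.e. q(t) = 2 + t
print(cofactor(4))  # [3, 3, 0, -2], i.e. q(t) = 3 + 3t - 2t^3
```

In both cases the degree of $q(t)$ equals $\binom{n-1}{2}$, consistent with the bound in Theorem 5.2.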